Semi-Supervised Word Sense Disambiguation for Mixed-Initiative Conversational Spoken Language Translation

Authors

  • Sankaranarayanan Ananthakrishnan
  • Sanjika Hewavitharana
  • Rohit Kumar
  • Enoch Kan
  • Rohit Prasad
  • Prem Natarajan

Abstract

Lexical ambiguity can cause critical failures in conversational spoken language translation (CSLT) systems when the wrong sense is presented in the target language. In this paper, we present a framework for improving translation of ambiguous source words that (a) constrains statistical machine translation (SMT) decoding with phrase pair clusters to select a desired sense for translation; (b) automatically predicts the intended sense of an ambiguous source word given its context; and (c) combines the above to define a set of interactive strategies to confirm the intended sense of an ambiguous word and guide the system to the correct translation. The novel use of this framework in a real-world CSLT system distinguishes our approach from existing work that focuses on word sense disambiguation (WSD) for non-interactive, batch-mode SMT. In addition to reporting metrics that evaluate this approach in an interactive spoken language translation system, we also present offline assessments of the component technologies, viz. constrained SMT decoding with sense-specific phrase pair clusters, and automated word sense prediction.

Similar resources

Lightly-Supervised Word Sense Translation Error Detection for an Interactive Conversational Spoken Language Translation System

Lexical ambiguity can lead to concept transfer failure in conversational spoken language translation (CSLT) systems. This paper presents a novel, classification-based approach to accurately detecting word sense translation errors (WSTEs) of ambiguous source words. The approach requires minimal human annotation effort, and can be easily scaled to new language pairs and domains, with only a wordal...

A Review Of Literature On Word Sense Disambiguation

Lexical ambiguity is a fundamental characteristic of language. Words can have more than one distinct meaning. Word sense disambiguation is defined as the problem of computationally determining which "sense" of a word is correct in a given context. Word sense disambiguation is a classification task where word senses are the classes, the context provides the evidence, and each occurrence of a word...

Unsupervised Translation Disambiguation for Cross-Domain Statistical Machine Translation

Most attempts at integrating word sense disambiguation with statistical machine translation have focused on supervised disambiguation approaches. These approaches are of limited use when the distribution of the test data differs strongly from that of the training data; however, word sense errors tend to be especially common under these conditions. In this paper we present different approaches t...

Review: Semi-Supervised Learning Methods for Word Sense Disambiguation

Word sense disambiguation (WSD) is an open problem in natural language processing that concerns identifying the appropriate sense of a word in a sentence when the word has multiple meanings. Many approaches have been proposed to solve the problem, of which supervised learning approaches are the most successful. However, supervised machine learning approaches are limited by the difficulties...

A Review on Word Sense Disambiguation

Word sense disambiguation (WSD) is described as the task of finding the sense of a word in a given context. WSD is a core problem in many tasks related to language processing. Its importance is heightened by its use in several critical applications such as part-of-speech tagging, machine translation, information retrieval, etc. Different topics such as ambiguity, evaluation, scalability and diversity cause ch...

Publication date: 2013